Tutorial Part II — R goes Spatial
Let me start with a big thanks to Cedric Scherer, Moritz Wenzler, Pierre Gras and Juergen Niedballa who constantly helped to improve the course since its start in 2014! I am deeply indebted to you!
Introduction
The course is intended to be a quick introduction to using R as a GIS. It builds on the previous course, Course1_IntroR, and is written in such a way that you can understand handling spatial data when knowing the basics of handling data.frames, matrices and lists. There is no need to know tidyverse/dplyr coding style.
This course provides the basics for using the most important spatial data types — vector (points, lines, polygons, aka ESRI shapefiles) and raster data.
Recently, {sf} objects have been developed to handle
vector data more easily. Those objects can be treated like simple
data.frames. Also, there are more modern alternatives to
the {raster} — namely the {terra} and
{stars} packages. The course is updated to use the
{terra} package.
The {terra} package provides the same functionality
while merging some of the old classes and function names from the
{raster} and being much faster. It is also especially
useful for remote sensing data.
{terra} has a single class SpatRaster class
for which {raster} has three (RasterLayer,
RasterStack, RasterBrick). Likewise there is a
single class for vector data SpatVector that replaces six
Spatial* classes. Most method names are the same, but note
the following important differences in methods names with the
{raster} package:
In addition to different data types, the course provides sections on coordinate projections and the most important geo-spatial operations. Last but not least - we plot a lot of maps to ease data visualization and because it is fun.
The data we use stem from a longstanding project we run in Borneo. Please refer to our website https://ecodynizw.github.io/. The project involves species conservation and large scale landscape planning. To learn more about the data used in the course and the project, please refer to the following publications that are freely available:
Targeted conservation to safeguard a biodiversity hotspot from climate and land-cover change. MJ Struebig, et al. 2015. Current Biology 25 (3), 372-378. https://doi.org/10.1016/j.cub.2014.11.067
Anticipated climate and land‐cover changes reveal refuge areas for Borneo’s orang-utans. MJ Struebig, et al. 2015. Global Change Biology 21 (8), 2891-2904. https://doi.org/10.1111/gcb.12814
The importance of correcting for sampling bias in MaxEnt species distribution models. S Kramer‐Schadt et al. 2013. Diversity and Distributions 19 (11), 1366-1379. https://doi.org/10.1111/ddi.12096
The Borneo carnivore database and the application of predictive distribution modelling. S Kramer-Schadt, et al.2016. Raffles Bulletin of Zoology, Supplement No. 33: 18–41. https://lkcnhm.nus.edu.sg/app/uploads/2017/06/S33rbz018-041-1.pdf
Useful (web)sites
1. R news and tutorials
* http://www.r-bloggers.com/
2. Quick introduction to spatial R * https://rspatial.org/
* http://pakillo.github.io/R-GIS-tutorial/
* http://rstudio-pubs-static.s3.amazonaws.com/7993_6b081819ba184047802a508a7f3187cb.html
3. Spatial visualisation * check out the packages
{tmap}, {ggmap}, {leaflet} and
{cartography}
* https://cran.r-project.org/web/packages/tmap/vignettes/tmap-getstarted.html
* A whole book: https://clauswilke.com/dataviz/ with spatial data here:
https://clauswilke.com/dataviz/geospatial-data.html
4. Overview of spatial packages * http://cran.r-project.org/web/views/Spatial.html =
http://cran.r-project.org/view=Spatial
* https://www.r-spatial.org/
5. R-cheatsheets — great for learning
‘vocabulary’
* e.g. here: https://www.rstudio.com/resources/cheatsheets/
6. Infos about simple features in R ({sf}
package)
* https://r-spatial.github.io/sf/index.html
* and nice intro: https://oliviergimenez.github.io/introspatialR/#1
* geocomputation: https://geocompr.robinlovelace.net/spatial-operations.html
7. Lots of EPSG Codes
* Use EPSG codes as unique and specific identifies of your coordinate
reference systems instead of writing projection details.
* for EPSG codes see http://spatialreference.org/ or https://www.nceas.ucsb.edu/~frazier/RSpatialGuides/OverviewCoordinateReferenceSystems.pd
8. BioMove specials: * For analysis of plot-based
biodiversity data: package {vegan}
* For analysis of telemetry data: packages {adehabitatHR},
{adehabitatLT}, {move},
{recurse}, {momentuHMM},
{moveHMM}, {ctmm}
Further reading
1. Check Roger Bivand’s publications:
* http://cran.r-project.org/web/views/Spatial.html
* http://www.csiss.org/learning_resources/index.html
* Roger Bivand et al. 2008, Applied Spatial Data Analysis in R,
Springer
2. Check Edzer Pebesma’s course materials and
publications:
* http://ifgi.uni-muenster.de/~epebe_01/
* https://edzer.github.io/UseR2017/
* https://www.rstudio.com/resources/videos/tidy-spatial-data-analysis/
3. Geocomputation with R: a open-access book of Robin
Lovelace, Jakub Nowosad & Jannes Muenchow
* https://geocompr.robinlovelace.net
4. DataVizArt Cedric Scherer -> check out his #TidyTuesday
or #30DaysMapChallenge contributions
* https://www.cedricscherer.com/top/dataviz/
* Codes: https://github.com/z3tt/TidyTuesday and https://github.com/z3tt/30DayMapChallenge
Basics
How to install…
pkgs <- c("terra", "stars", "sf", "sp", "rasterVis",
"ggplot2", "tmap", "viridis", "patchwork", "here", "units",
"devtools", "osmdata", "elevatr","tanaka","jpeg")
## install packages that are not installed yet
## (not important to understand the following code, just run it)
unavailable <- setdiff(pkgs, rownames(installed.packages()))
install.packages(unavailable)
## install development version of rnaturalearth as currently the
## download doesn't work in the CRAN package version
devtools::install_github("ropensci/rnaturalearth")
## install rgeoboundaries from GitHub (not available on CRAN yet)
devtools::install_github("wmgeolab/rgeoboundaries")…and load packages
## when working with raster data
## do NOT load both packages terra and raster at the same time,
## as this creates problems with the namespace
library(terra) ## the "new" {raster} package
#library(rgdal) # TODO: remove, outdated
#library(rgeos) # TODO: remove, outdated
library(rasterVis)
## working with vector data
library(sf) ## the "new" {sp} package
library(sp)
library(stars)
## visualization
library(ggplot2)
library(tmap)
library(viridis) ## nice colour palettes
library(patchwork) ## to combine plots
library(units) ## handle measurement data
library(here) ## for easy directory management
#install.packages("Rcpp", repos="https://rcppcore.github.io/drat")How to call help
## Information about a function or a package. If you do not understand some
## of the code below, always check the arguments and help of the function.
?terra
## search for an item:
??SpatialPolygons
## show the instructions of a package:
vignette(package = "sf") # vignette(package="sp")
## load the instruction of a package (embedded pdf):
vignette("sf1") # vignette("intro_sp")Assign the workspace
If you have downloaded the repository for the course as described
here (https://ecodynizw.github.io/teaching.html), you
automatically have adopted our folder structure. Nothing more to do!
Continue with chapter ‘R-projects and package
{here}’.
Optional - use own course folder setup
Optional - setting workspace by hand and learn about R-projects
If not: Set your root directory
Create the subfolders relative to root-folder. Please adjust to your setup. A possible folder structure could be:
.
└── <your folder name> ─ root folder , e.g. d6_teaching_collection
├── data ─ data
│ └── data_borneo ─ e.g., the Borneo data
│ ├── geo_raster_current_asc ─ geo data, raster ascii format, as in data_borneo
│ └── animal_data
└── R ─ store here all your scripts, i.e.
├── my_script_course1.R
└── my_script_course2.RTraditionally, the working directory was hard coded, i.e. the full path was specified:
## adjust the working directory [wd]
## getwd() ## alternative fct
##
## ## Set your OWN root directory, the following is just an example of how to do it:
## #root_wd <- setwd("C:/Users/kramer/d6_teaching_collection")This approach is error-prone and it complicates the cooperation with others as they have to update the working directory in every script they receive from you. (And if they don’t change it back to your directory, you also have to update it again…)
Nowadays, the best approach is to work with R projects that are associated with own working directory, workspace, history, and source documents.
You can create a project in the RStudio IDE by using the
Create Project command (available on the Projects menu and
on the global toolbar). In our case, we want to create a project in an
existing directory where there is already R code and data placed inside
that said folder. Now, a hidden folder called .Rproj.user
and most importantly a file called your_project.Rproj
should appear in your folder. Every time you want to work on your
project, you open the Rstudio session by double-clicking that
.Rproj file, which ensures that the root working directory
is correctly set.
The goal of the {here} package is to enable easy file
referencing in project-oriented workflows. In contrast to using
setwd(), which is fragile and dependent on the way you
organize your files, here uses the top-level directory of a project to
easily build paths to files.
The here() function from the {here} package
searches for the .Rproj file and will allow you to
construct relative paths easily within your project environment:
## [1] "/Users/cedric/Library/CloudStorage/GoogleDrive-cedricphilippscherer@gmail.com/My Drive/Work/Eco/d6_teaching_collection"
here::dr_here() # because here() alone interferes with the package `purrr` if that is loaded
root_wd <- here::here() # the root folder is automatically set
root_wd ## [1] "/Users/cedric/Library/CloudStorage/GoogleDrive-cedricphilippscherer@gmail.com/My Drive/Work/Eco/d6_teaching_collection"
Now, we assign the path to these folders relative to the root directory, and we work with the Borneo data:
.
└── root_wd ─ e.g. d6_teaching_collection
└── dbor_wd ─ data / data_borneo
├── maps_wd ─ geo data
└── anim_wd ─ animal location data## today, we start with Borneo
#dbor_wd <- paste0(root_wd, "/", "data/data_borneo") #- the old way
dbor_wd <- here("data","data_borneo") #- note the nested folder structure
# maps_wd <- here("data","data_borneo","geo_raster_current_asc") ## same as:
maps_wd <- paste0(dbor_wd, "/geo_raster_current_asc")
#- mind difference paste0() and paste()
vecs_wd <- paste0(dbor_wd, "/geo_vector")
anim_wd <- paste(dbor_wd, "/animal_data", sep = '') #- paste() needs a separator signCreate an output folder where you can temporarily store your results (anything you plot, create, want to save,….) with dir.create. If you use our downloaded course repository, you already have this folder.
#output_wd <- paste0(dirname(root_wd), "/", "output") #- the old way
output_wd <- here("output")
if (!dir.exists(output_wd)) dir.create(output_wd) #- create only if directory does NOT! existFinally, your folder structure will look like this:
.
└── <your folder name> ─ root folder , e.g. d6_teaching_collection
├── data ─ data
│ ├── data_borneo ─ e.g., the Borneo data
│ ├── geo_raster_current_asc ─ geo data, raster ascii format, as in data_borneo
│ └── animal_data
├── R ─ store here all your scripts, i.e.
│ ├── my_script_course1.R
│ └── my_script_course2.R
├── output ─ for any results/ plots,...
└ <your R project name.Rproj> ─ in case you are working with an R-ProjectA small yet important hint on file and folder naming: Do NOT! use dots (NOT: folder.name), comma, semicolon, or any special signs like ö, ä, ü, ß etc. in your folder names, and keep name length at a maximum of 13 letters!
Access folder content
You will find a lot of bioclimatic raster files (.asc) in the
geo_raster_current_asc folder, which we named
R-object maps_wd above:
## [1] "bio_asc_01.asc" "bio_asc_21.asc" "bio_asc_22.asc" "bio_asc_24.asc" "bio_asc_27.asc" "bio_asc_42.asc"
Should this be empty, then you have wrongly set the root directory and the data directory. Please check again how your folder structure looks like. By default, this folder should contain six ascii-files. The description is provided in the word.doc in the folder.
Recap from Course 1 Basics in R
useful functions:
head(): only shows the first 6 elements, i.e. if you have a large data frame, you can quickly check header and the first entries (analogously:tail())paste(x, sep=''): appends characters or numbers — great for working directories, loops across many maps or data.frames etc
data formats and access:
data.frame(): [row, column] or “dollar sign”, i.e.mydata[1,2]→ first row, second column, ormydata$MyColumnName[1]matrix(): see abovelist(): [[element]], i.e.mylist[[1]]. Attention! A list element can consist of a data.frame that you can then access like that:mylist[[1]][1,2]…
S4-objects (like spatial data)
- access via
@, i.e.RasterLayer@data@values
Raster data
Data import
First, load a raster map. Please check the description in the folder called DataDescription_Borneo.doc. The file bio_asc_01.asc is showing mean annual temperature multiplied by 10. Why? Because it can be stored as integer, not floating type (i.e. with digits), which saves a lot of storage and memory.
## ## read the ascii file as raster format
ras_bio_asc_01 <- rast(x = paste0(maps_wd, "/bio_asc_01.asc")) ## `raster()` with {raster}
## or use here:
ras_bio_asc_01 <- rast(x = here("data", "data_borneo","geo_raster_current_asc", "bio_asc_01.asc"))
Bor_mat <- ras_bio_asc_01 ## easy copying of whole maps; mat stands for mean annual tempNow have a look at the object:
## class : SpatRaster
## dimensions : 1425, 1423, 1 (nrow, ncol, nlyr)
## resolution : 0.008333334, 0.008333334 (x, y)
## extent : 108.3341, 120.1925, -4.374013, 7.500988 (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat WGS 84
## source : bio_asc_01.asc
## name : bio_asc_01
What does the slot with extent tells you? Looks like decimal degrees, i.e. could be latitude and longitude.
Now have a look at the slot crs (CRS stands for
coordinate reference system): Nicely in the new
{terra}-package, the Coordinate Reference System (map
projection) is automatically set in the following way: If this argument
is missing when loading data with rast(x= ..., crs= ...),
and the x coordinates are within -360 .. 360 and the y coordinates are
within -90 .. 90, “+proj=longlat +datum=WGS84” is used.
However, sometimes the slot CRS is empty, i.e. if other coordinates than decimal degrees are used, e.g. UTM. That means, without knowing the reference system, the map cannot be plotted at the exact location on the globe. Thus, when you get a map, always make sure you also get the information of the coordinate reference system! If it is not yet set/ assigned (because you import a raster map as a simple matrix or points from a file), this is the first thing you should do!
# str(ras_bio_asc_01) ## check this
crs(ras_bio_asc_01) ## crs = coordinate reference system. If empty, assign it!## [1] "GEOGCRS[\"WGS 84\",\n DATUM[\"World Geodetic System 1984\",\n ELLIPSOID[\"WGS 84\",6378137,298.257223563,\n LENGTHUNIT[\"metre\",1]],\n ID[\"EPSG\",6326]],\n PRIMEM[\"Greenwich\",0,\n ANGLEUNIT[\"degree\",0.0174532925199433],\n ID[\"EPSG\",8901]],\n CS[ellipsoidal,2],\n AXIS[\"longitude\",east,\n ORDER[1],\n ANGLEUNIT[\"degree\",0.0174532925199433,\n ID[\"EPSG\",9122]]],\n AXIS[\"latitude\",north,\n ORDER[2],\n ANGLEUNIT[\"degree\",0.0174532925199433,\n ID[\"EPSG\",9122]]]]"
Assign CRS
Important! Please read and repeat basic GIS-knowledge on coordinate reference systems (CRS), map projections, and geographic and projected coordinates.
If a map already has a defined CRS, then you cannot simply overwrite
this! To re-calculate the coordinates from one system into another
system (e.g. from angular coordinates of latitude/ longitude of WGS84 to
a projected CRS with planar coordinates and meter distances, equal area
of Lambert Azimuthal), you need to transform it with the
function project()(raster-data) or
st_transform()(vector-data) (see below chapter
Raster Projection).
In the following, we assign the CRS only when no information is provided. This can be the case when you upload a map based on a simple data frame or matrix, i.e. when it is not a spatial object.
crs(<raster>) gives the information of the object
(here: raster layer) Bor_mat or
ras_bio_asc_01, and is also used to set the CRS of the
object via assignment:
crs(<raster>) <- <projection>.
Mind the spelling in small letters/ capitals!
crs(<raster>)shows coordinate reference system of a spatial objectcrs(<raster>) <- <projection>creates projection and sets arguments for CRS!
An easy way to specify the CRS is using the EPSG code, e.g. instead
of "+proj=longlat +datum=WGS84" use
"+init=epsg:4326".
See http://spatialreference.org/. The EPSG code for WGS84 is 4326.
In case the CRS was not yet assigned:
# many ways to assign the CRS, options listed below:
crs(ras_bio_asc_01) <- "+proj=longlat +datum=WGS84"
crs(ras_bio_asc_01) <- "+init=epsg:4326"Projections can also be transferred from one object to another, but only when the CRS hasn’t yet been specified.
In the following, load the topographic map (digital elevation model)
of Borneo (Bor_dem.asc or bio_asc_24.asc) and
assign it the same CRS as Bor_mat, because we know they are
overlaying (you can check that with the raster extents):
## ras_bio_asc_24 = DEM = digital elevation model
ras_bio_asc_24 <- rast(x = paste0(maps_wd, "/bio_asc_24.asc")) ## `raster()` in {raster}
crs(ras_bio_asc_24)## [1] "GEOGCRS[\"WGS 84\",\n DATUM[\"World Geodetic System 1984\",\n ELLIPSOID[\"WGS 84\",6378137,298.257223563,\n LENGTHUNIT[\"metre\",1]],\n ID[\"EPSG\",6326]],\n PRIMEM[\"Greenwich\",0,\n ANGLEUNIT[\"degree\",0.0174532925199433],\n ID[\"EPSG\",8901]],\n CS[ellipsoidal,2],\n AXIS[\"longitude\",east,\n ORDER[1],\n ANGLEUNIT[\"degree\",0.0174532925199433,\n ID[\"EPSG\",9122]]],\n AXIS[\"latitude\",north,\n ORDER[2],\n ANGLEUNIT[\"degree\",0.0174532925199433,\n ID[\"EPSG\",9122]]]]"
Clip areas
We do this first to work with a small raster to save computation time. For this, we create a clip_extent based on the spatial coordinates. The command ‘crop’ then clips the raster map.
## SpatExtent : 108.33412288693, 120.19245688433, -4.3740129986359, 7.5009876663641 (xmin, xmax, ymin, ymax)
clip_extent <- ext(117.2, 117.45, 5.4, 5.5)
ras_bio_asc_01_cr <- crop(x = ras_bio_asc_01, y = clip_extent)
plot(ras_bio_asc_01_cr, col = viridis::inferno(10))Accessing RasterLayer
First, have a look at the internal data structure of the raster object:
## class : SpatRaster
## dimensions : 12, 30, 1 (nrow, ncol, nlyr)
## resolution : 0.008333334, 0.008333334 (x, y)
## extent : 117.2008, 117.4508, 5.400988, 5.500988 (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat WGS 84
## source(s) : memory
## varname : bio_asc_01
## name : bio_asc_01
## min value : 261
## max value : 268
## get additional information:
# str(ras_bio_asc_01_cr)
# attributes(ras_bio_asc_01_cr)
# class(ras_bio_asc_01_cr)There are many ways to retrieve internal data and access single bits of information:
## SpatExtent : 117.20079005013, 117.45079006413, 5.4009875487641, 5.5009875543641 (xmin, xmax, ymin, ymax)
## [1] 117.2008
## [1] 30
## bio_asc_01
## 1 265
## 2 265
## 3 265
## 4 266
## 5 267
## 6 267
## [1] "GEOGCRS[\"WGS 84\",\n ENSEMBLE[\"World Geodetic System 1984 ensemble\",\n MEMBER[\"World Geodetic System 1984 (Transit)\",\n ID[\"EPSG\",1166]],\n MEMBER[\"World Geodetic System 1984 (G730)\",\n ID[\"EPSG\",1152]],\n MEMBER[\"World Geodetic System 1984 (G873)\",\n ID[\"EPSG\",1153]],\n MEMBER[\"World Geodetic System 1984 (G1150)\",\n ID[\"EPSG\",1154]],\n MEMBER[\"World Geodetic System 1984 (G1674)\",\n ID[\"EPSG\",1155]],\n MEMBER[\"World Geodetic System 1984 (G1762)\",\n ID[\"EPSG\",1156]],\n MEMBER[\"World Geodetic System 1984 (G2139)\",\n ID[\"EPSG\",1309]],\n ELLIPSOID[\"WGS 84\",6378137,298.257223563,\n LENGTHUNIT[\"metre\",1],\n ID[\"EPSG\",7030]],\n ENSEMBLEACCURACY[2.0],\n ID[\"EPSG\",6326]],\n PRIMEM[\"Greenwich\",0,\n ANGLEUNIT[\"degree\",0.0174532925199433],\n ID[\"EPSG\",8901]],\n CS[ellipsoidal,2],\n AXIS[\"longitude\",east,\n ORDER[1],\n ANGLEUNIT[\"degree\",0.0174532925199433,\n ID[\"EPSG\",9122]]],\n AXIS[\"latitude\",north,\n ORDER[2],\n ANGLEUNIT[\"degree\",0.0174532925199433,\n ID[\"EPSG\",9122]]],\n USAGE[\n SCOPE[\"unknown\"],\n AREA[\"World.\"],\n BBOX[-90,-180,90,180]]]"
## crs = coordinate reference system defined
## CRS creates projection and takes args for crs!
## e.g.
wgs84_crs_args <- CRS("+proj=longlat +datum=WGS84")
# wgs84_crs_args ## please check## [1] 12
## [1] 12 30 1
## [1] 0.008333334 0.008333334
An important function is crds()
(coordinates() in the {raster} package), which
returns the centroids of each raster cell:
## x y
## [1,] 117.2050 5.496821
## [2,] 117.2133 5.496821
## [3,] 117.2216 5.496821
## [4,] 117.2300 5.496821
## [5,] 117.2383 5.496821
## [6,] 117.2466 5.496821
## [7,] 117.2550 5.496821
## [8,] 117.2633 5.496821
## [9,] 117.2716 5.496821
## [10,] 117.2800 5.496821
Terrain computation
We will now work with the digital elevation model (DEM)
ras_bio_asc_24 and create a hillshade (3D look) based on
topography using slope and aspect: - simulates a 3D surface - computes
shaded relief values for a raster surface
slope <- terrain(x = ras_bio_asc_24, v = "slope", unit = "radians", neighbors = 8) ## arg `v` called `opt` in {raster}
aspect <- terrain(x = ras_bio_asc_24, v = "aspect", unit = "radians", neighbors = 8) ## arg `v` called `opt` in {raster}
Bor_hs <- shade(slope, aspect, angle = 45, direction = 270) ## `hillShade()` in {raster}
## {raster} equivalent:
#slope <- terrain(x = ras_bio_asc_24, opt = "slope", unit = "radians", neighbors = 8)
#aspect <- terrain(x = ras_bio_asc_24, opt = "aspect", unit = "radians", neighbors = 8)
#Bor_hs <- hillShade(slope, aspect, angle = 45, direction = 270)Let’s plot the hillshade:
Other very useful terrain calculations are cost surfaces, cost distances and least cost path, e.g. for corridor calculations. We will do that in another course.
Visualising rasters
Difference between plot and image
plot()
plot keeps the proportion of the map / cells (e.g. squares here).
However, if you plot several layers on top of each other, plot will not overlay them exactly, because the argument ‘add’ fails (example below). In that case, better use image, or any other plot function.
raster::image()
If you use image, the plotted proportions will be changed and skewed to the plot surface.
{tmap} package
Static map
If you want to use a static map as high quality plots for your
publication, save the output via tmap_save(). Make sure
that the quality is sufficient by setting the dpi (“dots per inch”) to
at least 600. We save this map in our output folder.
tmap_mode("plot")
(m <- tm_shape(ras_bio_asc_01) +
tm_raster(palette = "viridis", title = "Mean annual temp"))tmap_save(m, paste0(output_wd, "/BorneoMap_4326_tmap.png"),
units = "mm", width = 90, height = 90, dpi = 600)The saved map: check output folder
Interactive map
Very good for checking the correct location of the data. Click on the + | - | layer icons on the upper left and zoom in or out and change the background layers. The ‘alpha’ argument makes the layer transparent.
If you like, you can save the interactive map — check your output folder for the html file. Tip of the day: always save the epsg code of your crs at the end of a spatial file name.
{ggplot2} package
optional; for advanced R-users: Beautiful plots can also be made with
{ggplot2}; however, RasterLayer objects cannot
directly be plotted. One can use the {stars}
package.
The {stars}
package is another package aimed at working with spatial data,
namely spatiotemporal arrays (such as raster and vector data cubes). We
first turn our raster into a stars object using function
st_as_stars and can then use the geom_stars()
function in combination with ggplot:
## [1] "stars"
g <- ggplot() +
geom_stars(data = stars_bio_asc_01) +
scale_fill_viridis_c(name = "°C * 10", na.value = "transparent") + ## set custom colors for NA
coord_sf(crs = "+init=epsg:4326") + ## set correct projection
labs(x = "Longitude", y = "Latitude",
title = "Mean annual temperature") +
theme_minimal() ## set custom plot style
gThe nice thing about ggplot is the flexibility to customize your plot
further. Thus it is very suitable to create static high-quality maps for
publication which can be saved via ggsave(). Make sure that
the quality is sufficient by setting the dpi (“dots per inch”) to at
least 600.
g2 <- g + theme(plot.title = element_text(size = 18, face = "bold"),
plot.title.position = "plot",
legend.position = c(0.2, 0.95),
legend.direction = "horizontal",
legend.key.width = unit(2.2, "lines"),
legend.key.height = unit(0.7, "lines"))
g2ggsave(filename = paste0(output_wd, "/BorneoMap_4326_ggplot.png"),
width = 8, height = 7, dpi = 600, bg = "white")(ggsave() saves automatically the last ggplot. You can
also specify a ggplot object via plot =).
The saved map: check output folder
Composite plots
Simple composite plot
Plot the hillshade (3D relief) and add temperature colours on top: alpha value gives semi-transparency. The extent is plotted as a red box, and the centroid coordinates of the raster cells are plotted as points.
image(Bor_hs, col = grey(0:100/100))
image(ras_bio_asc_24, col = terrain.colors(25, alpha = 0.3), add = TRUE)
points(crds(ras_bio_asc_01_cr), cex = 0.1, pch = '+')
plot(ext(ras_bio_asc_01_cr), add = TRUE, col = 'red')Now open the image-plot with the zoom icon in a separate window and change the size. You will see that the different layers are still plotted on top of each other.
Now do the same with the base plot function:
plot(Bor_hs, col = grey(0:100/100), legend = FALSE)
plot(ras_bio_asc_24, col = terrain.colors(25, alpha = 0.3), add = TRUE)
points(crds(ras_bio_asc_01_cr), cex = 0.1, pch = '+')
plot(ext(ras_bio_asc_01_cr), add = TRUE, col = 'red')You will see that zoom does not work here very well, the layers do not overlap any more. Nb: always use a specified plotting function for spatial data.
Composite plot with {tmap}
In the following, we will create a SpatialPointDataFrame
from the small clipped/ cropped raster with the {sf}
package and plot it on top of the hillshade. Use the possibility of
selecting/ disregarding different layers (layer icon on the left).
hillsh <- Bor_hs
# make sf object from coordinates -> you will learn that also later
ras_bio_asc_01_cr_sf <- st_as_sf(
data.frame(crds(ras_bio_asc_01_cr)),
## create dataframe of coordinates; `coordingates()` in {raster}
coords = c("x", "y"), ## define columns for the coordinates
crs = 4326, ## define crs, 4326 is the EPSG code
sf_column_name = "geometry" ## sf needs a geometry column and you have to name it
)
# interactive plot
tmap_mode(mode = "view")
tm_shape(shp = hillsh, raster.downsample = TRUE) +
tm_raster(palette = "Greys") +
tm_shape(shp = ras_bio_asc_24, raster.downsample = TRUE) +
tm_raster(palette = grDevices::topo.colors(20),alpha = 0.3) +
tm_shape(shp = ras_bio_asc_01, raster.downsample = TRUE) +
tm_raster(palette = grDevices::rainbow(10), alpha = 0.3) +
tm_shape(shp = ras_bio_asc_01_cr_sf) +
tm_dots(shape = 20, size = 0.01) What do the shape-arguments mean, e.g. shape = 20? Have
a look here at the symbols: https://ggplot2.tidyverse.org/articles/ggplot2-specs.html?q=shape#sec:shape-spec
Unfortunately, in {tmap} you can only use
shape = 1 and shape = 20, only circles are
plotted.
# static plot
tmap_mode(mode = "plot")
tm_shape(shp = hillsh, raster.downsample = TRUE) +
tm_raster(palette = "Greys") +
tm_shape(shp = ras_bio_asc_24, raster.downsample = TRUE) +
tm_raster(palette = grDevices::terrain.colors(10), alpha = 0.3) +
tm_shape(shp = ras_bio_asc_01_cr_sf) +
tm_dots(shape = 1, size = 0.05) persp() (3D plot)
# Cool 3D plots with rgl library, e.g. 'rgl.surface'
persp(x = ras_bio_asc_24, xlab = "Easting", ylab = "Northing",
zlab = "elevation", r = 5, d = 1.5, expand = 0.1,
ticktype = "detailed")Why is the map so spiny? Check units! This is still a geographic CRS, not a projected one. The units are in degrees. Solution: project to a geographic crs (chapter: Raster projection).
Let’s work with the cropped map. Check out the options:
library(tanaka)
tanaka(ras_bio_asc_01_cr,
legend.pos = "bottom",
legend.title = "mean ann temp\n(°C * 10)")Also possible with ggplot: https://eliocamp.github.io/metR/reference/geom_contour_tanaka.html
Raster projection
Project the grid (the small one!!) to have all units in meters:
Bor_dem_moll <- terra::project(ras_bio_asc_01_cr, "+proj=moll +lat_0=65 +lon_0=10")
## `projectRaster()` in {raster}
persp(Bor_dem_moll, xlab = "Easting", ylab = "Northing",
zlab = "elevation", main = "Elevation model of Borneo",
r = 1, d = 5.5, expand = 0.1, ticktype = "detailed")Raster Stacks
A raster stack is a collection of RasterLayer objects
with the same spatial extent and resolution, similar to a geodatabase.
Go into the maps folder and check what is inside:
# gives names and full path of file
files.full <- list.files(path = maps_wd, pattern = '.asc$', full.names = TRUE)
# files.full # check also
files.full[1:3]## [1] "/Users/cedric/Library/CloudStorage/GoogleDrive-cedricphilippscherer@gmail.com/My Drive/Work/Eco/d6_teaching_collection/data/data_borneo/geo_raster_current_asc/bio_asc_01.asc"
## [2] "/Users/cedric/Library/CloudStorage/GoogleDrive-cedricphilippscherer@gmail.com/My Drive/Work/Eco/d6_teaching_collection/data/data_borneo/geo_raster_current_asc/bio_asc_21.asc"
## [3] "/Users/cedric/Library/CloudStorage/GoogleDrive-cedricphilippscherer@gmail.com/My Drive/Work/Eco/d6_teaching_collection/data/data_borneo/geo_raster_current_asc/bio_asc_22.asc"
# names only
files.rel <- list.files(path = maps_wd, pattern = '.asc$', full.names = FALSE)
files.rel[1:3]## [1] "bio_asc_01.asc" "bio_asc_21.asc" "bio_asc_22.asc"
Working with stacks
The advantage is, that you do not need to apply a command to each
raster map separately, but can do it ‘all in one’. E.g., We can set the
crs of each single raster in just one line. In the following, we
stack four maps in a ‘geodatabase’ called ‘predictors’. Check
the .doc for a description of the layers. In the following, we select 4
spatial layers, numbers 9, 12, 22 and 24. The description of what they
represent is in the data description document in the map folder. We load
them all together with terra::rast()
## [1] "GEOGCRS[\"WGS 84\",\n DATUM[\"World Geodetic System 1984\",\n ELLIPSOID[\"WGS 84\",6378137,298.257223563,\n LENGTHUNIT[\"metre\",1]],\n ID[\"EPSG\",6326]],\n PRIMEM[\"Greenwich\",0,\n ANGLEUNIT[\"degree\",0.0174532925199433],\n ID[\"EPSG\",8901]],\n CS[ellipsoidal,2],\n AXIS[\"longitude\",east,\n ORDER[1],\n ANGLEUNIT[\"degree\",0.0174532925199433,\n ID[\"EPSG\",9122]]],\n AXIS[\"latitude\",north,\n ORDER[2],\n ANGLEUNIT[\"degree\",0.0174532925199433,\n ID[\"EPSG\",9122]]]]"
A raster stack contains the single raster layers in a list:
## class : SpatRaster
## dimensions : 1425, 1423, 1 (nrow, ncol, nlyr)
## resolution : 0.008333334, 0.008333334 (x, y)
## extent : 108.3341, 120.1925, -4.374013, 7.500988 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs
## source : bio_asc_01.asc
## name : bio_asc_01
## class : SpatRaster
## dimensions : 1425, 1423, 1 (nrow, ncol, nlyr)
## resolution : 0.008333334, 0.008333334 (x, y)
## extent : 108.3341, 120.1925, -4.374013, 7.500988 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs
## source : bio_asc_01.asc
## name : bio_asc_01
Plotting just one layer:
Plotting stacks
With base plot
With {tmap}, plotting all stack layers at once can be a
problem:
This is standardizing the legend to the min and max of all layers, which is problematic if maps are not in the same units. That’s why you cannot see the information in the maps:
Extract values from stacks
With the help of the {rasVis} package we can plot violin
charts to visualize the summary statistics:
Again, these do not look nice because of the different axis scaling.
For advanced users: More beautiful violin plots can be found in
ggplot2. For this, we need to transform the RasterLayer
data into a data.frame first:
## first omit NA values (which represent the ocean around Borneo)
raster_df <- na.omit(data.frame(values(predictors[[c(1,3)]])))
raster_names <- names(raster_df)
raster_ct <- dim(raster_df)[1]
df2 <- data.frame(val = c(raster_df[,names(raster_df)[1]],
raster_df[,names(raster_df)[2]]))
df2$grp <- rep(raster_names , each = raster_ct)
head(df2)## val grp
## 1 271 bio_asc_01
## 2 271 bio_asc_01
## 3 270 bio_asc_01
## 4 270 bio_asc_01
## 5 270 bio_asc_01
## 6 270 bio_asc_01
## take a random subsample of the data to not crash your PC when plotting:
a <- sample(x = nrow(df2), size = 1000, replace = FALSE)
df3 <- df2[a,]## a violin-box plot combination with raw data strips
p <- ggplot(data = df3, aes(x = grp, y = val)) +
geom_violin(scale = "width", fill = "grey85", color = "#3366FF", bw = 20) +
geom_boxplot(width = 0.15, size = 0.8, outlier.color = NA) + ## remove outliers
geom_jitter(height = 0, width = 0.05, alpha = 0.2, size = 1.5, color = 'blue')
pSince the units are so different, it makes more sense in this case to
plot them separately, using a facet_wrap() and
scales = free.
p +
facet_wrap(vars(grp), scales = "free") +
scale_x_discrete(guide = "none") + ## remove axis ticks and labels on x
labs(x = NULL, y = "Value") +
theme_minimal(base_size = 15) ## set custom plot style# save the last plot
ggsave(paste0(output_wd, "/savedggplot.pdf"), width = 5, height = 5, dpi = 600)
## or use here:
# ggsave(here("output", "savedggplot.pdf"), width = 5, height = 5, dpi = 600)…or plot them separately and combine them using the {patchwork}
package:
df3_bio1 <- subset(df3, df3$grp == 'bio_asc_01')
df3_bio24 <- subset(df3, df3$grp == 'bio_asc_24')
p1 <- ggplot(data = df3_bio1, aes(x = grp, y = val)) +
geom_violin(scale = "width", fill = "grey85", colour = "#3366FF") +
geom_boxplot(width = 0.2, size = 0.8, outlier.color = NA) + ## remove outliers
geom_jitter(height = 0, width = 0.05, alpha = 0.2, size = 1.5, colour = "#3366FF") +
labs(x = NULL, y = "Value")
p2 <- ggplot(data = df3_bio24, aes(x = grp, y = val)) +
geom_violin(scale = "width", fill = "grey85", colour = "#3366FF") +
geom_boxplot(width = 0.2, size = 0.8, outlier.color = NA) + ## remove outliers
geom_jitter(height = 0, width = 0.05, alpha = 0.2, size = 1.5, colour = "#3366FF") +
labs(x = NULL, y = "")
## multipanel plot with {patchwork}
(p1 + p2) * theme_minimal(base_size = 15) ## apply custom stylecheck here for how {ggplot2} works if you want to make
really nice plots: * https://github.com/rstudio/cheatsheets/blob/master/data-visualization-2.1.pdf
or get inspiration here:
- https://www.cedricscherer.com/2019/08/05/a-ggplot2-tutorial-for-beautiful-plotting-in-r/ *https://www.r-graph-gallery.com/
Back to spatial R!
Retrieve metrics/ statistics from the rasters:
#- Extract information from all rasters in one command
# cellStats(predictors, 'mean')
round(x = global(x = predictors, stat = 'mean', na.rm = TRUE), digits = 2) ## mean
## bio_asc_01 255.61
## bio_asc_21 0.02
## bio_asc_24 251.75
## bio_asc_42 10.65
Manipulating rasters
Do any kind of raster algebra:
For changing cell size use aggregate() or resample():
# collapse 20*20 cells into 1 using function 'mean':
ras_bio_asc_01_agg <- aggregate(x = ras_bio_asc_01, fact = 20, fun = mean) We finally plot the two new rasters next to each other:
par(mfrow = c(1,2)) # recap from course 1: plots in 1 row and 2 columns
plot(x = new_ras, col = rev(rainbow(5)))
plot(ras_bio_asc_01_agg)Change and query raster values
# convert ras_bio_asc_01 to degree Celsius units (divide by 10)
range(values(ras_bio_asc_01), na.rm = TRUE)## [1] 63 278
## [1] 6.3 27.8
Have a look at the result using a binary plot:
Accessing cell values
Remember the function extract(). In the
following, we extract from the small raster the first three values (=
cols 1-3; longitude or x-axis) at line 5 (= row 5; latitude or
y-axis).
cells <- cellFromRowCol(object = ras_bio_asc_01_cr, row = 5, col = 1:3)
cells ## Note: returns cell ID number, the index / rownumber! Not the value!!!!## [1] 121 122 123
## bio_asc_01
## 1 266
## 2 265
## 3 266
## plot it:
plot(x = ras_bio_asc_01_cr)
## to plot the points on top, insert the 'cells' index into
## the data frame of the coordinates of the RasterLayer object ras_bio_asc_01_cr
points(x = crds(ras_bio_asc_01_cr)[cells,], col = "blue")Advantage of working with stacks: Extract raster values from all rasters at once (raster stack):
This is very important if you want to create your master table, i.e. based on the xy-coordinates of your species sightings (i.e. GPS point locations), you could extract the important environmental predictors at that location.
We do that for an area in the center of Borneo. For this, we first
get the row and column indices of the center of a large Borneo map,
e.g. ras_bio_asc_01, and 5 x 5 cells in addition, i.e. a
square with an area of 5 km x 5 km:
center_x = floor(nrow(predictors) / 2) ## learn about the functions round(),
center_y = floor(ncol(predictors) / 2) ## ceiling(), floor()
center_x; center_y ## center coordinates of the Borneo maps## [1] 712
## [1] 711
stack_cells <- cellFromRowCol(
object = ras_bio_asc_01,
row = center_x:(center_x + 5),
col = center_y:(center_y + 5)
)
head(stack_cells) ## mind: these are index numbers! Not cell values!## [1] 1012464 1013888 1015312 1016736 1018160 1019584
Why are these index numbers so large? Because the counting starts at
the upper left corner of the map and continues consecutively, i.e. the
data (ras_bio_asc_01@data@values) are one huge vector with
length nrow * ncol!
Now, extract them with the useful function èxtract()into
an object called ‘mastertable’ that you can for example use for
statistical analyses in R:
## bio_asc_01 bio_asc_21 bio_asc_24 bio_asc_42
## 1 255 0.000000000 303 1
## 2 249 0.008333334 352 1
## 3 243 0.016666669 439 1
## 4 234 0.025000000 702 2
## 5 234 0.033333339 588 2
## 6 221 0.041666672 881 2
## stack_cells bio_asc_01 bio_asc_21 bio_asc_24 bio_asc_42
## 1 1012464 255 0.000000000 303 1
## 2 1013888 249 0.008333334 352 1
## 3 1015312 243 0.016666669 439 1
## 4 1016736 234 0.025000000 702 2
## 5 1018160 234 0.033333339 588 2
## 6 1019584 221 0.041666672 881 2
Change raster values:
## get the coordinates; cells was the object containing the index (i.e. the cell numbering)
## the function returns a matrix (it could have been easy...)
xy <- xyFromCell(object = ras_bio_asc_01_cr, cell = cells)
# coordinates(ras_bio_asc_01_cr)[cells,] ## in {raster} package
xy## x y
## [1,] 117.2050 5.463488
## [2,] 117.2133 5.463488
## [3,] 117.2216 5.463488
## [1] "matrix" "array"
## bio_asc_01
## 1 266
## 2 265
## 3 266
# extract(ras_bio_asc_01_cr, cells) ## in {raster} package
## Change values, e.g. for adding forest or creating a corridor
## take care! -> irreversible! better work on a copy!
copy_ras <- ras_bio_asc_01_cr
copy_ras[cells] <- 250 # here we set all values in the raster at the position index cells to 250
plot(x = copy_ras, col = viridis(20))Compute distance to points
## turn first into SpatVector - this is the format needed for the distance() function below
sv_xy <- vect(xy)
sv_xy## class : SpatVector
## geometry : points
## dimensions : 3, 0 (geometries, attributes)
## extent : 117.205, 117.2216, 5.463488, 5.463488 (xmin, xmax, ymin, ymax)
## coord. ref. :
Check the object sv_xy: since xy was a
simple matrix and not a spatial object, there is no crs assigned. The
slot for ‘coord.ref.’ is empty. That means, we now have to first assign
the CRS so that the following functions know where exactly the different
layers are on earth.
## class : SpatVector
## geometry : points
## dimensions : 3, 0 (geometries, attributes)
## extent : 117.205, 117.2216, 5.463488, 5.463488 (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat WGS 84 (EPSG:4326)
Now the CRS is set/ assigned and we can continue with distance calculations:
##
|---------|---------|---------|---------|
=========================================
## `distanceFromPoints(object = ras_bio_asc_01_cr, xy = xy)` in {raster}
plot(x = my_dist) ## units?
points(xy)## nicer plot adding the points from which a distance should be computed.
## please ignore the following 4 lines, just run them,
## you will learn that when working with vector data below
xy_sf <- sf::st_as_sf(x = sv_xy)
tmap_mode(mode = "plot")
tm_shape(shp = my_dist) +
tm_raster(n = 100, palette = rev(grDevices::terrain.colors(100)),
legend.show = FALSE) +
tm_shape(shp = xy_sf) +
tm_dots(size = 1)Calculate distance from many points:
cells1 <- c(cells, 250, 360) ## add two more points to cells = cells1
sv_xy_2 <- vect(xyFromCell(ras_bio_asc_01_cr, cells1)) ## we create a new spatVect
crs(sv_xy_2) <- "epsg:4326" ## same procedure as above
my_dist <- distance(x = ras_bio_asc_01_cr, y = sv_xy_2)##
|---------|---------|---------|---------|
=========================================
Data export - saving raster data
Save the raster (not the plot…): There are many exchange formats for rasters. The best choice is considered to be GeoTiff, which also saves the projection and is smaller than ascii. However, for modelling e.g. in MaxEnt, the raster maps are needed in ascii-format.
## save the small cropped file
terra::writeRaster(x = ras_bio_asc_01_cr,
filename = paste0(output_wd,"/bor_crop.asc"),
# or use here(): here("output", "bor_crop.asc")
overwrite = TRUE,
NAflag = -9999)
terra::writeRaster(x = ras_bio_asc_01_agg,
filename = paste0(output_wd,"/ras_bio_asc_01_agg.asc"),
# or use here(): here("output", "ras_bio_asc_01_agg.asc")
overwrite = TRUE,
NAflag = -9999)Vector data / shapefiles
Polygons and lines
Import shapefiles
Currently, there are two packages sp and sf (standing for simple features), both with still important functionality. However, sf is much easier to use and handle. The suggestion currently is: learn both!
In the following, the most important commands are provided for the sf-package, and if necessary, also for the sp-package:
## Border of countries and provinces of Borneo
## - only loading columns 1:3, 5, 7, 17, 18 of attribute table:
Borneo_shp <- st_read(dsn = vecs_wd, layer = "borneo_admin")[, c(1:3, 5, 7, 17, 18)] ## Reading layer `borneo_admin' from data source
## `/Users/cedric/Library/CloudStorage/GoogleDrive-cedricphilippscherer@gmail.com/My Drive/Work/Eco/d6_teaching_collection/data/data_borneo/geo_vector'
## using driver `ESRI Shapefile'
## Simple feature collection with 158 features and 18 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 108.5986 ymin: -4.943054 xmax: 119.2697 ymax: 7.380556
## Geodetic CRS: WGS 84
## Protected Areas (National Parks, Nature Reserves, Forest Reserves)
PA_shp <- st_read(dsn = vecs_wd,
layer = "Bor_PA")[, c(1:4)]## Reading layer `Bor_PA' from data source
## `/Users/cedric/Library/CloudStorage/GoogleDrive-cedricphilippscherer@gmail.com/My Drive/Work/Eco/d6_teaching_collection/data/data_borneo/geo_vector'
## using driver `ESRI Shapefile'
## Simple feature collection with 255 features and 12 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 108.5969 ymin: -4.18356 xmax: 119.6176 ymax: 9.911229
## Geodetic CRS: WGS 84
## fix problematic polygons
PA_shp <- st_make_valid(PA_shp)
## main rivers
River_shp <- st_read(dsn = vecs_wd, layer = "sn_100000")## Reading layer `sn_100000' from data source
## `/Users/cedric/Library/CloudStorage/GoogleDrive-cedricphilippscherer@gmail.com/My Drive/Work/Eco/d6_teaching_collection/data/data_borneo/geo_vector'
## using driver `ESRI Shapefile'
## Simple feature collection with 396 features and 4 fields
## Geometry type: LINESTRING
## Dimension: XY
## Bounding box: xmin: 108.9454 ymin: -3.787917 xmax: 118.7879 ymax: 6.464583
## Geodetic CRS: WGS 84
## Reading layer `borneo_admin' from data source
## `/Users/cedric/Library/CloudStorage/GoogleDrive-cedricphilippscherer@gmail.com/My Drive/Work/Eco/d6_teaching_collection/data/data_borneo/geo_vector'
## using driver `ESRI Shapefile'
## Simple feature collection with 158 features and 18 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 108.5986 ymin: -4.943054 xmax: 119.2697 ymax: 7.380556
## Geodetic CRS: WGS 84
Working with vector data
Accessing simple feature objects — easy handling, similar to
data.frames:
## Please note the similarity to accessing info from rasters.
str(object = ext(PA_shp)) ## `extent()` in {raster}## S4 class 'SpatExtent' [package "terra"]
## [1] 108.5969
## [1] "SITE_ID" "NAME_ENG" "COUNTRY" "SUB_NAT" "geometry"
Accessing SpatialPolygonsDataFrames of the former
{sp}-package:
# TODO: outdated as readOGR was replaced by st_read
## Much more complex than sf-objects.
class(Admin_shp)## [1] "sf" "data.frame"
## S4 class 'SpatExtent' [package "terra"]
## [1] 108.5986
## [1] "ID_0" "ISO" "NAME_0" "NAME_1" "NAME_2" "Shape_Leng" "Shape_Area" "geometry"
Summarizing spatial information for each column of attribute table [a.t.]:
## SITE_ID NAME_ENG COUNTRY SUB_NAT geometry
## Min. : 783 Length:255 Length:255 Length:255 MULTIPOLYGON :255
## 1st Qu.: 3844 Class :character Class :character Class :character epsg:4326 : 0
## Median : 9842 Mode :character Mode :character Mode :character +proj=long...: 0
## Mean : 45309
## 3rd Qu.: 19582
## Max. :901238
## Classes 'sf' and 'data.frame': 255 obs. of 5 variables:
## $ SITE_ID : int 785 787 791 795 1300 783 784 786 788 789 ...
## $ NAME_ENG: chr "Kinabalu" "Mulu" "Niah" "Tawau Hill Park" ...
## $ COUNTRY : chr "Malaysia" "Malaysia" "Malaysia" "Malaysia" ...
## $ SUB_NAT : chr "Sabah" "Sarawak" "Sarawak" "Sabah" ...
## $ geometry:sfc_MULTIPOLYGON of length 255; first list element: List of 1
## ..$ :List of 1
## .. ..$ : num [1:322, 1:2] 117 117 117 117 117 ...
## ..- attr(*, "class")= chr [1:3] "XY" "MULTIPOLYGON" "sfg"
## - attr(*, "sf_column")= chr "geometry"
## - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA
## ..- attr(*, "names")= chr [1:4] "SITE_ID" "NAME_ENG" "COUNTRY" "SUB_NAT"
## Simple feature collection with 255 features and 4 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 108.5969 ymin: -4.18356 xmax: 119.6176 ymax: 9.911229
## Geodetic CRS: WGS 84
## First 10 features:
## SITE_ID NAME_ENG COUNTRY SUB_NAT geometry
## 1 785 Kinabalu Malaysia Sabah MULTIPOLYGON (((116.5979 6....
## 2 787 Mulu Malaysia Sarawak MULTIPOLYGON (((114.9033 4....
## 3 791 Niah Malaysia Sarawak MULTIPOLYGON (((113.7986 3....
## 4 795 Tawau Hill Park Malaysia Sabah MULTIPOLYGON (((117.8497 4....
## 5 1300 Lanjak-Entimau Malaysia Sarawak MULTIPOLYGON (((112.2054 1....
## 6 783 Semporna Malaysia Sabah MULTIPOLYGON (((118.7796 4....
## 7 784 Samunsam Malaysia Sarawak MULTIPOLYGON (((109.6615 1....
## 8 786 Similajau Malaysia Sarawak MULTIPOLYGON (((113.1826 3....
## 9 788 Klias Malaysia Sabah MULTIPOLYGON (((115.636 5.3...
## 10 789 Pulau Tiga Malaysia Sabah MULTIPOLYGON (((115.6796 5....
## $names
## [1] "SITE_ID" "NAME_ENG" "COUNTRY" "SUB_NAT" "geometry"
##
## $row.names
## [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38
## [39] 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76
## [77] 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114
## [115] 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152
## [153] 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190
## [191] 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228
## [229] 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255
##
## $sf_column
## [1] "geometry"
##
## $agr
## SITE_ID NAME_ENG COUNTRY SUB_NAT
## <NA> <NA> <NA> <NA>
## Levels: constant aggregate identity
##
## $class
## [1] "sf" "data.frame"
Have a look at the content:
## Simple feature collection with 6 features and 4 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 111.8754 ymin: 1.32105 xmax: 118.7899 ymax: 6.4952
## Geodetic CRS: WGS 84
## SITE_ID NAME_ENG COUNTRY SUB_NAT geometry
## 1 785 Kinabalu Malaysia Sabah MULTIPOLYGON (((116.5979 6....
## 2 787 Mulu Malaysia Sarawak MULTIPOLYGON (((114.9033 4....
## 3 791 Niah Malaysia Sarawak MULTIPOLYGON (((113.7986 3....
## 4 795 Tawau Hill Park Malaysia Sabah MULTIPOLYGON (((117.8497 4....
## 5 1300 Lanjak-Entimau Malaysia Sarawak MULTIPOLYGON (((112.2054 1....
## 6 783 Semporna Malaysia Sabah MULTIPOLYGON (((118.7796 4....
## Simple feature collection with 6 features and 4 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 111.4107 ymin: -3.035619 xmax: 118.2907 ymax: 6.4952
## Geodetic CRS: WGS 84
## SITE_ID NAME_ENG COUNTRY SUB_NAT geometry
## 250 317300 Maganting Island Malaysia Sabah MULTIPOLYGON (((118.2839 4....
## 251 317301 Bod Tai Malaysia Sabah MULTIPOLYGON (((118.2182 5....
## 252 342562 Tanjung Penghujan NR/RP Indonesia <NA> MULTIPOLYGON (((111.4377 -2...
## 253 901236 Tasek Merimbun Brunei Darussalam <NA> MULTIPOLYGON (((114.7238 4....
## 254 901237 Kinabalu National Park Malaysia <NA> MULTIPOLYGON (((116.5979 6....
## 255 901238 Mulu National Park Malaysia <NA> MULTIPOLYGON (((114.9033 4....
Retrieving information of a.t. — recap working with
data.frames from Day1_R-Intro course.
## Simple feature collection with 1 feature and 4 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 116.4638 ymin: 5.99952 xmax: 116.7631 ymax: 6.4952
## Geodetic CRS: WGS 84
## SITE_ID NAME_ENG COUNTRY SUB_NAT geometry
## 1 785 Kinabalu Malaysia Sabah MULTIPOLYGON (((116.5979 6....
## Simple feature collection with 255 features and 1 field
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 108.5969 ymin: -4.18356 xmax: 119.6176 ymax: 9.911229
## Geodetic CRS: WGS 84
## First 10 features:
## NAME_ENG geometry
## 1 Kinabalu MULTIPOLYGON (((116.5979 6....
## 2 Mulu MULTIPOLYGON (((114.9033 4....
## 3 Niah MULTIPOLYGON (((113.7986 3....
## 4 Tawau Hill Park MULTIPOLYGON (((117.8497 4....
## 5 Lanjak-Entimau MULTIPOLYGON (((112.2054 1....
## 6 Semporna MULTIPOLYGON (((118.7796 4....
## 7 Samunsam MULTIPOLYGON (((109.6615 1....
## 8 Similajau MULTIPOLYGON (((113.1826 3....
## 9 Klias MULTIPOLYGON (((115.636 5.3...
## 10 Pulau Tiga MULTIPOLYGON (((115.6796 5....
## PA_shp$NAME_ENG ## returns a long list
head(x = PA_shp$NAME_ENG) ## using fct head() to only show the first 6 entries## [1] "Kinabalu" "Mulu" "Niah" "Tawau Hill Park" "Lanjak-Entimau" "Semporna"
## Simple feature collection with 1 feature and 1 field
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 116.4638 ymin: 5.99952 xmax: 116.7631 ymax: 6.4952
## Geodetic CRS: WGS 84
## SITE_ID geometry
## 1 785 MULTIPOLYGON (((116.5979 6....
## Simple feature collection with 1 feature and 1 field
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 114.7777 ymin: 3.9186 xmax: 114.9945 ymax: 4.28586
## Geodetic CRS: WGS 84
## COUNTRY geometry
## 2 Malaysia MULTIPOLYGON (((114.9033 4....
The following returns the row indices of the data frame or a.t., respectively:
## [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
## [39] 51 52 53 54 55 56 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 80 81 101 102 103 104 113 114 115 116 117 118 119 120 121 122 123
## [77] 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 161 162
## [115] 163 164 165 166 167 169 170 171 172 185 186 187 190 191 192 193 194 195 196 221 222 223 224 225 226 227 228 230 233 237 238 239 240 241 249 250 251 254
## [153] 255
Visualising vector data
{ggplot2} package
You can use many packages, e.g. {ggplot2}, to plot
{sf} objects: https://www.r-spatial.org/r/2018/10/25/ggplot2-sf.html.
ggplot(data = PA_shp) +
geom_sf(fill = "chartreuse3", color = NA) +
labs(x = "Longitude", y = "Latitude",
title = "Protected areas") +
theme_minimal(base_size = 15) ## set custom plot styleNote that it does NOT work for {sp} objects
(SpatialPolygonsDataFrame).
{tmap} package
plot()
It is also possible to use a simple plot() function for
{sf} objects. However, mind that this object has 7 columns,
so each column holds information that will be plotted, i.e. you will
plot 7 single plots. So do a little transformation: either plot just one
column, or define it as spatial:
Now we illustrate how to plot objects created with the
{sp} package.
Here, the basic plot function works:
spplot() for sp Objects
There is also a special function for plotting {sp}
objects. However, once again you need to select which information you
want to plot.
## ID_0 ISO NAME_0 NAME_1 NAME_2 Shape_Leng Shape_Area
## 1 158 MYS Malaysia Sabah Lahad Datu 6.7475367 0.62106012
## 2 158 MYS Malaysia Sabah Papar 1.6727672 0.10065591
## 3 158 MYS Malaysia Sabah Penampang 0.9548857 0.03951545
## 4 158 MYS Malaysia Sabah Pensiangan 4.0257859 0.49725547
## 5 158 MYS Malaysia Sabah Pitas 2.9223456 0.11955352
## 6 158 MYS Malaysia Sabah Ranau 2.4815704 0.24027591
# TODO: added raster dependency here, decide if we still want to show this as it's outdated anyway
raster::spplot(Admin_sp[6]) ## only plot one of the geometries, coloured e.g. by shape_length = column 6An one can use {tmap}. For example, we can plot a detail
on the map — plot of a single selected protected area:
tmap_mode(mode = "plot")
tm_shape(shp = Borneo_shp) +
tm_polygons(border.col = "deepskyblue4") +
tm_shape(PA_shp[,1]) +
tm_polygons(border.col = "black") +
tm_shape(PA_shp[1, ]) +
tm_polygons(border.col = "red")Small exercises, and saving vector data
Simplify your life! Work with subsets, select Malaysia, save and plot as shapefile with st_write():
Mal_PA_shp <- subset(PA_shp, PA_shp$COUNTRY == 'Malaysia')
st_write(obj = Mal_PA_shp,
dsn = output_wd,
layer = 'ProtectedAreasMalaysia',
driver = 'ESRI Shapefile',
delete_layer = TRUE)## Deleting layer `ProtectedAreasMalaysia' using driver `ESRI Shapefile'
## Writing layer `ProtectedAreasMalaysia' to data source
## `/Users/cedric/Library/CloudStorage/GoogleDrive-cedricphilippscherer@gmail.com/My Drive/Work/Eco/d6_teaching_collection/output' using driver `ESRI Shapefile'
## Writing 153 features with 4 fields and geometry type Multi Polygon.
and plot:
tmap_mode(mode = "plot")
tm_shape(shp = Borneo_shp) +
tm_polygons(border.col = "blue") +
tm_shape(shp = Mal_PA_shp) +
tm_polygons(col = "red")Extract elevation per PA and save average to a.t. as a new column:
fewPA <- Mal_PA_shp[c(1:5), 1]
tmp <- extract(x = ras_bio_asc_24, y = vect(fewPA), xy = TRUE) ## returns a list — each element contains the elevation raster cells (ras_bio_asc_24)
str(tmp)## 'data.frame': 3681 obs. of 4 variables:
## $ ID : num 1 1 1 1 1 1 1 1 1 1 ...
## $ bio_asc_24: int 286 491 508 500 775 986 645 582 335 675 ...
## $ x : num 117 117 117 117 117 ...
## $ y : num 6.49 6.49 6.48 6.48 6.48 ...
Append mean elevation:
## a bit complicated:
# fewPA$mean_elevation <- round(x = unlist(lapply(tmp, FUN = mean, na.rm = TRUE)))
## ...or for now solved stepwise with aggregate()
mean_tmp <- aggregate(tmp$bio_asc_24, by = list(Category = tmp$ID), FUN = mean)
## dplyr approach:
#mean_tmp <- dplyr::summarize(dplyr::group_by(tmp, ID), x = mean(bio_asc_24, na.rm = TRUE))
fewPA$mean_elevation <- mean_tmp$x
fewPA## Simple feature collection with 5 features and 2 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 111.8754 ymin: 1.32105 xmax: 118.0747 ymax: 6.4952
## Geodetic CRS: WGS 84
## SITE_ID geometry mean_elevation
## 1 785 MULTIPOLYGON (((116.5979 6.... 1348.4446
## 2 787 MULTIPOLYGON (((114.9033 4.... 554.8083
## 3 791 MULTIPOLYGON (((113.7986 3.... 107.7250
## 4 795 MULTIPOLYGON (((117.8497 4.... 512.3037
## 5 1300 MULTIPOLYGON (((112.2054 1.... 403.9674
Geospatial calculation
Using the ‘old’ {sp} package and the
{rgeos} package, commands for calculating properties of
spatial objects were: gArea(), gLength(),
gDistance(). Other often used GIS functions relate to basic
spatial operations, such as gBuffer(),
gCentroid(), gDifference(),
gUnion(), and gIntersection(),
gUnionCascaded() (= dissolve function) and
gSimplify(). Spatial queries were
gIntersects(), gContains(),
gIsValid(). http://www.nickeubank.com/wp-content/uploads/2015/10/RGIS2_MergingSpatialData_part2_GeometricManipulations.html.
The new {sf} package has the same functionality, with
functions starting with st_, e.g. st_buffer(),
st_intersection() etc.
Spatial queries and transformations
Area calculation
Returns the area in square meters:
## Units: [m^2]
## [1] 7651059265 1238543644 486042067 6128271711 1468016456 2955188183
CRS projection
If no CRS is assigned for a file, the CRS can be set with
st_crs(). However, if a file has already a CRS, e.g. WGS84
(angular units, longitude and latitude), and you want to change into a
projection with planar units (e.g. meters), use
st_transform().
Borneo_shp_moll <- st_transform(Borneo_shp, c("+proj=moll +datum=WGS84"))
class(Borneo_shp_moll) # sf object, data.frame!## [1] "sf" "data.frame"
tmap_mode(mode = "plot")
tm_shape(shp = Borneo_shp_moll) +
tm_polygons(border.col = "blue") ## do you see the difference?Note the change in the area calculation!
## Units: [m^2]
## [1] 7666715202 1241315346 487129643 6141990607 1471302821 2961802507
We can also use {ggplot2}:
ggplot(Borneo_shp_moll) +
geom_sf(color = "blue") +
theme_minimal(base_size = 15) ## set custom plot styleExample: Area of Malaysia in Borneo
Mal_Borneo_shp <- subset(Borneo_shp_moll, Borneo_shp_moll$NAME_0 == 'Malaysia')
head(st_area(x = Mal_Borneo_shp) / 1000000) ## or / 1e6## Units: [m^2]
## [1] 7666.7152 1241.3153 487.1296 6141.9906 1471.3028 2961.8025
## better: use set_units to change the units from m^2 to km^2
Mal_Borneo_shp$area <- units::set_units(x = st_area(x = Mal_Borneo_shp), value = km^2)
head(Mal_Borneo_shp$area)## Units: [km^2]
## [1] 7666.7152 1241.3153 487.1296 6141.9906 1471.3028 2961.8025
Example: Calculate area of a polygon and add the area of a polygon as new column to a.t.
## 487129643 [m^2]
## Units: [m^2]
## [1] 7666715202 1241315346 487129643 6141990607 1471302821 2961802507
area_km2 <- set_units(x = st_area(x = Borneo_shp_moll, byid = TRUE), value = km^2)
Borneo_shp_moll = data.frame(Borneo_shp_moll, area_km2)
head(x = Borneo_shp_moll)## ID_0 ISO NAME_0 NAME_1 NAME_2 Shape_Leng Shape_Area geometry area_km2
## 1 158 MYS Malaysia Sabah Lahad Datu 6.7475367 0.62106012 MULTIPOLYGON (((11824461 58... 7666.7152 [km^2]
## 2 158 MYS Malaysia Sabah Papar 1.6727672 0.10065591 MULTIPOLYGON (((11590069 72... 1241.3153 [km^2]
## 3 158 MYS Malaysia Sabah Penampang 0.9548857 0.03951545 MULTIPOLYGON (((11592912 72... 487.1296 [km^2]
## 4 158 MYS Malaysia Sabah Pensiangan 4.0257859 0.49725547 MULTIPOLYGON (((11651495 62... 6141.9906 [km^2]
## 5 158 MYS Malaysia Sabah Pitas 2.9223456 0.11955352 MULTIPOLYGON (((11703665 83... 1471.3028 [km^2]
## 6 158 MYS Malaysia Sabah Ranau 2.4815704 0.24027591 MULTIPOLYGON (((11675075 77... 2961.8025 [km^2]
The following for {sp} objects:
# TODO: decide if this can/should be removed
#gArea(Borneo_shp_moll, byid=TRUE) / 1e6 ## does not work, as Borneo_shp_moll is sf-object!
head(rgeos::gArea(Admin_sp, byid = TRUE) / 1e6) ## what does the warning message mean? -> check CRS!## 1 2 3 4 5 6
## 6.210601e-07 1.006559e-07 3.951545e-08 4.972555e-07 1.195535e-07 2.402759e-07
Admin_sp_moll <- spTransform(Admin_sp, c("+proj=moll +datum=WGS84"))
head(rgeos::gArea(Admin_sp_moll, byid = TRUE) / 1e6) ## now it works!## 1 2 3 4 5 6
## 7666.7152 1241.3153 487.1296 6141.9906 1471.3028 2961.8025
Points
Import spatial points
Usually, shapefiles come with a .proj file defining the
projection, but not always. Please always check whether the projection
is set:
## Reading layer `FCsim' from data source
## `/Users/cedric/Library/CloudStorage/GoogleDrive-cedricphilippscherer@gmail.com/My Drive/Work/Eco/d6_teaching_collection/data/data_borneo/animal_data/FCsim.shp'
## using driver `ESRI Shapefile'
## Simple feature collection with 55 features and 3 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: 109.3758 ymin: -2.744159 xmax: 118.4117 ymax: 5.824283
## CRS: NA
## Simple feature collection with 55 features and 3 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: 109.3758 ymin: -2.744159 xmax: 118.4117 ymax: 5.824283
## CRS: NA
## First 10 features:
## Id POINT_X POINT_Y geometry
## 1 0 118.4117 5.221667 POINT (118.4117 5.221667)
## 2 0 118.2645 5.005603 POINT (118.2645 5.005603)
## 3 0 117.7992 5.029167 POINT (117.7992 5.029167)
## 4 0 117.6058 5.177957 POINT (117.6058 5.177957)
## 5 0 117.5566 5.381088 POINT (117.5566 5.381088)
## 6 0 117.4704 5.005603 POINT (117.4704 5.005603)
## 7 0 117.2734 5.227201 POINT (117.2734 5.227201)
## 8 0 116.7256 5.319533 POINT (116.7256 5.319533)
## 9 0 116.6467 5.162500 POINT (116.6467 5.1625)
## 10 0 117.2475 4.888333 POINT (117.2475 4.888333)
## Simple feature collection with 55 features and 3 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: 109.3758 ymin: -2.744159 xmax: 118.4117 ymax: 5.824283
## Geodetic CRS: WGS 84
## First 10 features:
## Id POINT_X POINT_Y geometry
## 1 0 118.4117 5.221667 POINT (118.4117 5.221667)
## 2 0 118.2645 5.005603 POINT (118.2645 5.005603)
## 3 0 117.7992 5.029167 POINT (117.7992 5.029167)
## 4 0 117.6058 5.177957 POINT (117.6058 5.177957)
## 5 0 117.5566 5.381088 POINT (117.5566 5.381088)
## 6 0 117.4704 5.005603 POINT (117.4704 5.005603)
## 7 0 117.2734 5.227201 POINT (117.2734 5.227201)
## 8 0 116.7256 5.319533 POINT (116.7256 5.319533)
## 9 0 116.6467 5.162500 POINT (116.6467 5.1625)
## 10 0 117.2475 4.888333 POINT (117.2475 4.888333)
Spatial points are defined by two columns with x and y coordinates,
can be imported as normal data.frames and then converted to
spatial objects:
pt_file <- paste0(anim_wd, "/MyNewSpecies.csv")
df_recs <- read.table(file = pt_file, header = TRUE, sep = ',') # animal records
class(x = df_recs)## [1] "data.frame"
## species long lat
## 1 mysimsp 109.4716 -1.11984615
## 2 mysimsp 109.3216 -0.02817942
## 3 mysimsp 117.0383 3.60515411
## 4 mysimsp 109.1550 0.09682059
## 5 mysimsp 111.1550 1.78848735
## 6 mysimsp 115.4550 4.93848752
See below of how to convert a data.frame into a spatial
{sf} object.
Plot the points
Simple data frames with location information, as long as the coordinates are the same, can also be plotted in space. We first plot the polygon:
plot(x = as(Borneo_shp, "Spatial"), col = 'grey', border = 'white') ## polygon
points(x = df_recs$long, df_recs$lat, cex = 0.5, pch = 15) ## simple d.f.!
plot(x = as(pt_shp, "Spatial"), col = 'blue', add = TRUE) {tmap} can only handle spatial objects, so the data
frame df_recs needs to be converted into an
{sf} object first.
You have to define which columns contain the coordinates via
coords =:
recs_sf <- st_as_sf(x = data.frame(df_recs),
coords = c("long", "lat"),
crs = 4326,
sf_column_name = "geometry")tmap_mode(mode = "plot")
tm_shape(shp = Borneo_shp) +
tm_polygons(col = "grey", border.col = "white") +
tm_shape(shp = recs_sf) +
tm_dots(shape = 3, size = 0.25) +
tm_shape(shp = pt_shp) +
tm_dots(col = "blue") Spatial overlay
Vector data
## retrieve the geometry (location) indices of PA_shp at
## the locations of sp_recs: which points are in PA_shp
nrow(recs_sf) ## 500## [1] 500
## [1] 11
Vector and raster data interaction
# for a RASTER: extract mean ann. temp. from ras_bio_asc_01
# and add it to a.t.of the locations/ points
mean_t <- extract(x = ras_bio_asc_01, y = vect(recs_sf)) ## for {terra} we need to wrap the sf object into spatial vector with`vect()`
recs_sf$mean_t <- mean_t$bio_asc_01
mean(x = recs_sf$mean_t) # hist(sp_recs_sf$mean_t)## [1] 272.7788
Saving vector data
Save as ESRI shapefile to import in any GIS
st_write(obj = insidePA,
dsn = output_wd,
layer = "inPA",
driver = "ESRI Shapefile",
delete_layer = TRUE)## Deleting layer `inPA' using driver `ESRI Shapefile'
## Writing layer `inPA' to data source
## `/Users/cedric/Library/CloudStorage/GoogleDrive-cedricphilippscherer@gmail.com/My Drive/Work/Eco/d6_teaching_collection/output' using driver `ESRI Shapefile'
## Writing 11 features with 5 fields and geometry type Point.
Data Sources
{rnaturalearth}
NaturalEarth (http://www.naturalearthdata.com) is a public domain map data set that features vector and raster data of physical and cultural properties. It is available at 1:10m, 1:50m, and 1:110 million scales.
{rnaturalearth}
(https://docs.ropensci.org/rnaturalearth) is an R package
to hold and facilitate interaction with NaturalEarth (ne_…) map
data.
After loading the package, you can for example quickly access
shapefiles of all countries that are already projected and can be stored
as either sp or sf objects:
library(rnaturalearth)
## store as sp object (SpatialPolygonsDataFrame)
world <- ne_countries() ## `returnclass = "sp"` by default
class(world)## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"
## [1] "sf" "data.frame"
## $input
## [1] "WGS 84"
This country data set (which is actually not downloaded but stored locally by installing the package) already contains several useful variables, mostly referring to different naming conventions (helpful when joining with other data sets), to identify continents and regions, and also some information on population size, GDP, and economy:
## [1] "featurecla" "scalerank" "labelrank" "sovereignt" "sov_a3" "adm0_dif" "level" "type" "tlc" "admin" "adm0_a3"
## [12] "geou_dif" "geounit" "gu_a3" "su_dif" "subunit" "su_a3" "brk_diff" "name" "name_long" "brk_a3" "brk_name"
## [23] "brk_group" "abbrev" "postal" "formal_en" "formal_fr" "name_ciawf" "note_adm0" "note_brk" "name_sort" "name_alt" "mapcolor7"
## [34] "mapcolor8" "mapcolor9" "mapcolor13" "pop_est" "pop_rank" "pop_year" "gdp_md" "gdp_year" "economy" "income_grp" "fips_10"
## [45] "iso_a2" "iso_a2_eh" "iso_a3" "iso_a3_eh" "iso_n3" "iso_n3_eh" "un_a3" "wb_a2" "wb_a3" "woe_id" "woe_id_eh"
## [56] "woe_note" "adm0_iso" "adm0_diff" "adm0_tlc" "adm0_a3_us" "adm0_a3_fr" "adm0_a3_ru" "adm0_a3_es" "adm0_a3_cn" "adm0_a3_tw" "adm0_a3_in"
## [67] "adm0_a3_np" "adm0_a3_pk" "adm0_a3_de" "adm0_a3_gb" "adm0_a3_br" "adm0_a3_il" "adm0_a3_ps" "adm0_a3_sa" "adm0_a3_eg" "adm0_a3_ma" "adm0_a3_pt"
## [78] "adm0_a3_ar" "adm0_a3_jp" "adm0_a3_ko" "adm0_a3_vn" "adm0_a3_tr" "adm0_a3_id" "adm0_a3_pl" "adm0_a3_gr" "adm0_a3_it" "adm0_a3_nl" "adm0_a3_se"
## [89] "adm0_a3_bd" "adm0_a3_ua" "adm0_a3_un" "adm0_a3_wb" "continent" "region_un" "subregion" "region_wb" "name_len" "long_len" "abbrev_len"
## [100] "tiny" "homepart" "min_zoom" "min_label" "max_label" "label_x" "label_y" "ne_id" "wikidataid" "name_ar" "name_bn"
## [111] "name_de" "name_en" "name_es" "name_fa" "name_fr" "name_el" "name_he" "name_hi" "name_hu" "name_id" "name_it"
## [122] "name_ja" "name_ko" "name_nl" "name_pl" "name_pt" "name_ru" "name_sv" "name_tr" "name_uk" "name_ur" "name_vi"
## [133] "name_zh" "name_zht" "fclass_iso" "tlc_diff" "fclass_tlc" "fclass_us" "fclass_fr" "fclass_ru" "fclass_es" "fclass_cn" "fclass_tw"
## [144] "fclass_in" "fclass_np" "fclass_pk" "fclass_de" "fclass_gb" "fclass_br" "fclass_il" "fclass_ps" "fclass_sa" "fclass_eg" "fclass_ma"
## [155] "fclass_pt" "fclass_ar" "fclass_jp" "fclass_ko" "fclass_vn" "fclass_tr" "fclass_id" "fclass_pl" "fclass_gr" "fclass_it" "fclass_nl"
## [166] "fclass_se" "fclass_bd" "fclass_ua" "geometry"
We can quickly plot it:
NOTE: Unfortunately, NaturalEarth is using weird de-facto and on-the-ground rules to define country borders which do not follow the borders the UN and most countries agree on. For correct and official borders, please use the {rgeoboundaries} package (see below).
You can specify the scale, category and type you want as in the examples below.
glacier_small <- ne_download(type = "glaciated_areas", category = "physical",
scale = "small", returnclass = "sf")
glacier_large <- ne_download(type = "glaciated_areas", category = "physical",
scale = "large", returnclass = "sf")
ggplot() +
geom_sf(data = world, color = "grey80", fill ="grey80") +
geom_sf(data = glacier_small, color = "grey40", fill = "grey40") +
coord_sf(crs = "+proj=eqearth") +
theme_void()
ggplot() +
geom_sf(data = world, color = "grey80", fill ="grey80") +
geom_sf(data = glacier_large, color = "grey40", fill = "grey40") +
coord_sf(crs = "+proj=eqearth") +
theme_void()Relief data
{rgeoboundaries}
The {rgeoboundaries}
package uses the Global Database of Political Administrative
Boundaries that provide generally accepted political borders. The data
are licensed openly.
library(rgeoboundaries)
ggplot(gb_adm0()) +
geom_sf(color = "grey40", lwd = .2) +
coord_sf(crs = "+proj=eqearth") +
theme_void()Lower administrative levels are available as well, e.g. in Germany adm1 represents federal states (“Bundesländer”), adm2 districts (“Kreise”) and so on.
Let’s plot the admin 1 levels for the DACH countries:
dach <- gb_adm1(c("germany", "switzerland", "austria"), type = "sscgs")
ggplot(dach) +
geom_sf(aes(fill = shapeGroup)) +
scale_fill_brewer(palette = "Set2") +
theme_void(){osmdata}
OpenStreetMap (https://www.openstreetmap.org) is a collaborative
project to create a free editable geographic database of the world. The
geodata underlying the maps is considered the primary output of the
project and is accessible from R via the {osmdata}
package.
We first need to define our query and limit it to a region. You can explore the features and tags (also available as information via OpenStreetMap directly).
## [1] "4wd_only" "abandoned" "abutters" "access" "addr" "addr:city"
## # A tibble: 6 × 2
## Key Value
## <chr> <chr>
## 1 craft agricultural_engines
## 2 craft atelier
## 3 craft bakery
## 4 craft basket_maker
## 5 craft beekeeper
## 6 craft blacksmith
## building the query, e.g. beekeepers
beekeeper_query <-
## you can automatically retrieve a boudning box (pr specify one manually)
getbb("Berlin") %>%
## build an Overpass query
opq() %>%
## access particular feature
add_osm_feature("craft", "beekeeper")
## download data
sf_beekeepers <- osmdata_sf(beekeeper_query)Now we can investigate beekeepers in Berlin:
## [1] "bbox" "overpass_call" "meta" "osm_points" "osm_lines" "osm_polygons" "osm_multilines"
## [8] "osm_multipolygons"
## Simple feature collection with 6 features and 26 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: 13.24443 ymin: 52.35861 xmax: 13.69093 ymax: 52.573
## Geodetic CRS: WGS 84
## osm_id name addr:city addr:country addr:housenumber addr:postcode addr:street addr:suburb check_date contact:phone contact:website craft email
## 358407135 358407135 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 358407138 358407138 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 417509803 417509803 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 417509805 417509805 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 597668310 597668310 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 597668311 597668311 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## facebook instagram man_made opening_hours operator organic phone product shop source website wheelchair works geometry
## 358407135 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> POINT (13.69068 52.35918)
## 358407138 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> POINT (13.69093 52.35894)
## 417509803 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> POINT (13.68991 52.35888)
## 417509805 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> POINT (13.6902 52.35861)
## 597668310 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> POINT (13.24445 52.573)
## 597668311 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> POINT (13.24443 52.57295)
beekeper_locations <- sf_beekeepers$osm_points
gb_ber <- gb_adm1(c("germany"), type = "sscgs")[3,] # the third element is Berlin
ggplot(beekeper_locations) +
geom_sf(data = gb_ber) +
# geom_sf(data = d6berlin::sf_berlin) + ## alternative to geoboundaries, but d6berlin needs to be loaded first
geom_sf(size = 2) +
theme_void(){elevatr}
The {elevatr}
(https://github.com/jhollist/elevatr/) is an R package
that provides access to elevation data from AWS Open Data Terrain Tiles
and the Open Topography Global data sets API for raster digital
elevation models (DEMs).
We first need to define a location or bounding box for our elevation
data. This can either be a data frame or a spatial object. We use an
sf object which holds the projection to be used when
assessing the elevation data:
library(elevatr)
## manually specify corners of the bounding box of the US
bbox_usa <- data.frame(x = c(-125.0011, -66.9326),
y = c(24.9493, 49.5904))
## turn into spatial, projected bounding box
sf_bbox_usa <- st_as_sf(bbox_usa, coords = c("x", "y"), crs = 4326)Now we can download the elevation data with a specified resolution z (ranging from 1 to 14 with 1 being very coarse and 14 being very fine).
An Advanced Map with {tmap}
At the end, an example of a beautiful map, created with
tmap:
tmap_mode(mode = "plot")
tm_shape(shp = hillsh) +
## hillshading
tm_raster(palette = "Greys",
legend.show = FALSE,
alpha = 0.75) +
## topographical model
tm_shape(shp = ras_bio_asc_24) +
tm_raster(palette = terrain.colors(25),
alpha = 0.2,
legend.show = FALSE) +
## rivers
tm_shape(shp = River_shp) +
tm_lines(col = "dodgerblue3") +
## protected areas
tm_shape(shp = PA_shp) +
tm_polygons(border.col = "white",
alpha = 0) +
## locations
tm_shape(shp = recs_sf) +
tm_dots(shape = 16,
size = 0.3) +
tm_shape(shp = insidePA) +
tm_dots(col = "red",
size = 1,
shape = 3) +
## styling
tm_layout(title = "END OF SESSION",
title.color = "darkgrey",
title.position = c(0.05, 0.9),
title.size = 2)Recap: the most important functions for spatial data
Load spatial data
Exercise
**Revisit the data set from course 1 on wild boar observations in
Berlin: data_wb_melden_en.csv. (it is here:
....\d6_teaching_collection\data\data_berlin\animal_data).
Now, we want to study the spatial patterns of the wild boar observations. We may hypothesize that wild boar observations are also related to local differences in weather. Answer the following question using spatial data sets and visualizations:**
- Question 1.1) Were wild boar with many piglets seen more often in warm parts of the city?
- Question 1.2) Were wild boars more often observed at colder spots of the city during sunny weather?
- Question 1.3) Additional question for the fast ones: Were large groups of wild boar more frequently seen in areas providing a dense tree cover?
Follow these steps to answer the questions:
- Load the wild boar data (
data_wb_melden_en.csv) and spatial data on temperature in Berlin (summer_temp_100m_3035.ascindata_berlin). Remember to set the correct CRS to the raster! (Hint: we need the Lambert Azimuthal Equal Area (LAEA) projection for Europe. Tip: always save the CRS as EPSG code to the filename ;-) )
- Load the wild boar data (
- Turn the wild boar data into a simple features object. Remember to set the correct CRS (the wild boar locations were collected in WGS 84) and to transform it to the same CRS as the raster afterwards!
- Inspect both spatial data sets by plotting the temperature raster with the wild boar locations on top using the R basic plot function. The locations should match the map (if not something went wrong when setting the CRS; check if they are the same).
- Select one of the temperature layers and extract the values for each wild boar location.
- Create a plot that visualizes the temperature at each wild boar observation with 3 or more piglets (y axis) for each number of piglets (e.g. make a boxplot and plot the raw data on top).
- Create a new variable that holds the weather category as either “sunny” or “other”.
- Visualize the temperature at each wild boar observation (y axis) for sunny and other weather (box and whisker plot + raw data).
- Save the wild boar observations with the local temperature information as an .Rds file.
For the additional question:
- Load the tree cover data set
(
tree_cover_density_2012_100m_3035.ascindata_berlin).
- Load the tree cover data set
(
- Inspect the data set by plotting the raster using the R basic plot function.
- Extract the tree cover within a buffer of 100m around each wild boar
location (hint: use the help to inspect the arguments of
extract()).
- Extract the tree cover within a buffer of 100m around each wild boar
location (hint: use the help to inspect the arguments of
- Create a boxplot that shows the tree cover (y axis) based on the group size.
Session Info
## R version 4.3.0 (2023-04-21)
## Platform: aarch64-apple-darwin20 (64-bit)
## Running under: macOS Ventura 13.2.1
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8
##
## time zone: Europe/Berlin
## tzcode source: internal
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] elevatr_0.99.0 osmdata_0.2.5 rgeoboundaries_1.2.9000 rnaturalearth_0.3.4.9000 tanaka_0.4.0 here_1.0.1
## [7] units_0.8-5 patchwork_1.1.2 viridis_0.6.4 viridisLite_0.4.2 tmap_3.3-4 ggplot2_3.4.3
## [13] stars_0.6-4 abind_1.4-5 sp_2.1-2 sf_1.0-14 rasterVis_0.51.6 lattice_0.22-5
## [19] terra_1.7-55
##
## loaded via a namespace (and not attached):
## [1] RColorBrewer_1.1-3 rstudioapi_0.15.0 jsonlite_1.8.8 wk_0.9.1 magrittr_2.0.3 farver_2.1.1
## [7] rmarkdown_2.20 fs_1.6.3 ragg_1.2.5 vctrs_0.6.3 memoise_2.0.1 hoardr_0.5.3
## [13] mapiso_0.3.0 base64enc_0.1-3 progress_1.2.2 htmltools_0.5.6 leafsync_0.1.0 usethis_2.2.2
## [19] curl_5.1.0 raster_3.6-26 s2_1.1.4 sass_0.4.7 slippymath_0.3.1 KernSmooth_2.23-20
## [25] bslib_0.5.1 htmlwidgets_1.6.2 httr2_0.2.3 lubridate_1.9.2 zoo_1.8-12 cachem_1.0.8
## [31] mime_0.12 lifecycle_1.0.4 pkgconfig_2.0.3 R6_2.5.1 fastmap_1.1.1 shiny_1.7.5
## [37] selectr_0.4-2 digest_0.6.33 colorspace_2.1-0 ps_1.7.5 rprojroot_2.0.3 leafem_0.2.3
## [43] pkgload_1.3.2.1 textshaping_0.3.6 crosstalk_1.2.0 maplegend_0.1.0 lwgeom_0.2-13 labeling_0.4.2
## [49] progressr_0.14.0 urltools_1.7.3 timechange_0.2.0 fansi_1.0.4 httr_1.4.7 compiler_4.3.0
## [55] proxy_0.4-27 remotes_2.4.2 withr_2.5.0 DBI_1.1.3 hexbin_1.28.3 pkgbuild_1.4.2
## [61] highr_0.10 rappdirs_0.3.3 sessioninfo_1.2.2 tmaptools_3.1-1 leaflet_2.2.1 classInt_0.4-10
## [67] tools_4.3.0 httpuv_1.6.11 glue_1.6.2 callr_3.7.3 promises_1.2.1 grid_4.3.0
## [73] generics_0.1.3 isoband_0.2.7 gtable_0.3.4 countrycode_1.5.0 leaflet.providers_2.0.0 class_7.3-21
## [79] rmdformats_1.0.4 hms_1.1.2 xml2_1.3.3 utf8_1.2.3 pillar_1.9.0 stringr_1.5.0
## [85] later_1.3.1 dplyr_1.1.0 deldir_2.0-2 tidyselect_1.2.0 miniUI_0.1.1.1 knitr_1.42
## [91] gridExtra_2.3 bookdown_0.35 crul_1.4.0 xfun_0.40 devtools_2.4.5 stringi_1.7.12
## [97] yaml_2.3.7 evaluate_0.20 codetools_0.2-19 httpcode_0.3.0 interp_1.1-5 tibble_3.2.1
## [103] cli_3.6.1 xtable_1.8-4 systemfonts_1.0.4 munsell_0.5.0 processx_3.8.2 jquerylib_0.1.4
## [109] dichromat_2.0-0.1 Rcpp_1.0.11 triebeard_0.4.1 png_0.1-8 XML_3.99-0.14 parallel_4.3.0
## [115] ellipsis_0.3.2 prettyunits_1.1.1 latticeExtra_0.6-30 jpeg_0.1-10 profvis_0.3.8 urlchecker_1.0.1
## [121] scales_1.2.1 e1071_1.7-14 purrr_1.0.1 crayon_1.5.2 rlang_1.1.2 rvest_1.0.3
## [127] rgeos_0.6-2